-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Port support for commercial GeoIP2 databases from Logstash. #24889
Conversation
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
jenkins please test this |
thanks for this fix @nezirus, Unfortunately, we have no good way to test this against a read database. mind commenting on your configuration and experience when running with this fix and a non-lite version of the database? |
We use it exactly the same like non commercial databases (minus the explicit 'database_file' option): So, after installing the plugin on all nodes in cluster, copying (gzipped) commercial databases, we update the pipeline (actual grok rules ommited): PUT _ingest/pipeline/geoip-test
{
"description": "geoip-test",
"processors": [
{
"grok": {...},
"geoip": {
"field": "client_ip",
"database_file": "GeoIP2-City.mmdb.gz"
}
}
]
} This is still young, but we haven't noticed any issues or noticed errors in logs. |
Hello guys, have you had the time to review this. We're running this for the last week or so without any noticeable problems (terabytes of data have already passed through the system). Btw, for performance reasons, we're running it with 'ingest.geoip.cache_size' > 100k. |
thanks for following up @nezirus, after thinking about how we do not have test infrastructure to verify the unique references to the commercial databaseTypes, I think it would make more sense to move from the switch statements on let me know if that sounds reasonable to you! |
Sure, that makes sense, I've updated my branch. |
💪 Jenkins, please test this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just one nit.
@@ -57,8 +57,8 @@ | |||
public final class GeoIpProcessor extends AbstractProcessor { | |||
|
|||
public static final String TYPE = "geoip"; | |||
private static final String CITY_DB_TYPE = "GeoLite2-City"; | |||
private static final String COUNTRY_DB_TYPE = "GeoLite2-Country"; | |||
private static final String CITY_DB_TYPE = "-City"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is now a suffix, can you change the constant name to reflect that, eg CITY_DB_SUFFIX
? And same with country as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we go, renamed in my branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sweet!
LGTM
* Port support for commercial GeoIP2 databases from Logstash. * Match GeoIP databases according to the database name suffix. * Rename CITY/COUNTRY_DB_TYPE, since they are suffixes now.
merged these changes into |
* master: (27 commits) Refactor TransportShardBulkAction.executeUpdateRequest and add tests Make sure range queries are correctly profiled. (elastic#25108) Test: allow setting socket timeout for rest client (elastic#25221) Migration docs for elastic#25080 (elastic#25218) Remove `discovery.type` BWC layer from the EC2/Azure/GCE plugins elastic#25080 When stopping via systemd only kill the JVM, not its control group (elastic#25195) Remove PrefixAnalyzer, because it is no longer used. Internal: Remove Strings.cleanPath (elastic#25209) Docs: Add note about which secure settings are valid (elastic#25212) Indices.rollover/10_basic should refresh to make the doc visible in lucene stats Port support for commercial GeoIP2 databases from Logstash. (elastic#24889) [DOCS] Add ML node to node.asciidoc (elastic#24495) expose simple pattern tokenizers (elastic#25159) Test: add setting to change request timeout for rest client (elastic#25201) Fix secure repository-hdfs tests on JDK 9 Add target_field parameter to gsub, join, lowercase, sort, split, trim, uppercase (elastic#24133) Add Cross Cluster Search support for scroll searches (elastic#25094) Adapt skip version in rest-api-spec/test/indices.rollover/20_max_doc_condition.yml Rollover max docs should only count primaries (elastic#24977) Add remote cluster infrastructure to fetch discovery nodes. (elastic#25123) ...
Could the commercial ISP-database (GeoIP2-ISP.mmdb) please be ported as well? It's working great with logstash, but I'm trying to eliminate logstash in my workflow entirely. |
Port support for commercial GeoIP2 databases from Logstash, City and Country for now. If there is interest, I'll submit ISP database in separate pull request.